Audio signal processing with adaptive noise-shaping modulation

ABSTRACT

Processing an audio signal is provided, which processing comprises conversion of the audio signal into a digital signal by a noise-shaping modulation, compressive encoding of the digital signal at a predetermined sampling rate into a compressed digital signal, and supplying the compressed digital signal, wherein the noise-shaping modulation is adaptive in response to at least one parameter.

[0001] The present invention relates to processing an audio signal, e.g. for recording or transmission, the processing comprising the steps of conversion of the audio signal into a digital signal by a noise-shaping modulation, compressive encoding of the digital signal at a predetermined sampling rate into a compressed digital signal, and supplying the compressed digital signal.

[0002] International Patent Application WO 98/16014 discloses a data compression apparatus for data compressing an audio signal. The data compression apparatus comprises an input terminal for receiving the audio signal, a 1-bit A/D converter for A/D converting the audio signal so as to obtain a bitstream signal, a lossless coder for carrying out a lossless data compression step on the bit-stream signal so as to obtain a data compressed bit-stream signal, and an output terminal for supplying the data compressed bit-stream signal. Further, a recording apparatus and a transmitter apparatus comprising the data compression apparatus are disclosed. In addition, a data expansion apparatus for data expanding the data compressed bit-stream signal supplied by the data compression apparatus is disclosed, as well as a reproducing apparatus and a receiver apparatus comprising the data expansion apparatus.

[0003] It is an object of the invention to provide advantageous compression. To this end, the invention provides a signal processing method and device and an apparatus for recording or transmission as defined in the independent claims. Advantageous embodiments are defined in the dependent claims.

[0004] According to a first aspect of the invention, the noise-shaping modulation is adaptive in response to at least one parameter. The invention is based on the recognition that by making the noise-shaping modulation adaptive, the compression gain of the encoder can be influenced. This is because a change in the noise-shaping modulation influences the correlation within the audio signal. More highly correlated signals can be better predicted and thus be better compressed. This aspect of the invention is especially advantageous for lossless encoders such as used in the encoding of Direct Stream Digital (DSD) signals, e.g. for storing on Super Audio Compact Disc (SACD).

[0005] By using adaptive sigma-delta modulation in the noise-shaping modulation, an increase of compression gain can be obtained by giving up some dynamic range. Listening tests have demonstrated that the huge dynamic range of the SACD recording medium appears to be less important, in the sense that e.g. a reduction of the dynamic range from 105 dB to 95 dB would hardly be perceivable. Particularly at high signal levels, a listener will in general be insensitive to a slight reduction in dynamic range due to masking effects. Experiments have revealed several ways in which the structure of a sigma-delta modulator can be adapted or modified to provide a higher compression gain from the encoding algorithm, such as use of a lower order sigma-delta modulator and/or creating structure in the high frequency noise of the modulator.

[0006] In an advantageous embodiment of the invention the conversion of the audio signal into the digital signal includes low-pass filtering of the audio signal followed by an adaptive noise-shaping modulation (e.g. sigma-delta modulation). Thereby, a further increase of compression gain may be obtained, but to a certain extent at the expense of a signal quality degradation caused by the bandwidth limitation resulting from the low-pass filtering.

[0007] The input audio signal may be supplied as an analog signal, whereby the adaptive sigma-delta modulation is conducted as part of the noise-shaping modulation, by which the audio signal is converted into a digital signal such as a 1 bit bitstream signal as prescribed by the DSD signal format.

[0008] The audio signal may alternatively be supplied to the conversion as a digital signal such as a 1 bit bitstream signal, which may be obtained by initial oversampling of an analog audio signal at a rate, which is a multiple of the predetermined sampling rate for the compressive encoding. In connection with the above-mentioned preferred embodiment the low pass filtering and noise-shaping modulation may thereby include downsampling of the 1 bit bitstream signal to the predetermined sampling rate. Thus, with a predetermined sampling rate of 64 times the sampling frequency of 44.1 kHz the oversampling could be conducted at a rate of 256 times the sampling frequency. At this sampling level any signal processing can be effected.
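
The rates named in the preceding paragraph can be worked out as follows. This is only an arithmetic illustration; the variable names are chosen here and do not come from the specification.

```python
# Worked arithmetic for the sampling rates named in the text.
BASE_RATE_HZ = 44_100          # sampling frequency of 44.1 kHz
ENCODER_MULTIPLE = 64          # predetermined sampling rate: 64 x 44.1 kHz
OVERSAMPLING_MULTIPLE = 256    # initial oversampling: 256 x 44.1 kHz

encoder_rate_hz = ENCODER_MULTIPLE * BASE_RATE_HZ
oversampled_rate_hz = OVERSAMPLING_MULTIPLE * BASE_RATE_HZ

# Factor by which the oversampled bitstream is downsampled before encoding.
downsampling_factor = oversampled_rate_hz // encoder_rate_hz

print(encoder_rate_hz)      # 2822400  (2.8224 MHz)
print(oversampled_rate_hz)  # 11289600 (11.2896 MHz)
print(downsampling_factor)  # 4
```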

[0009] In the following the invention will be further explained with reference to the accompanying drawings, in which

[0010] FIGS. 1 and 2 are simplified schematic block diagrams of two alternative embodiments of a signal processing apparatus according to the invention,

[0011] FIGS. 3 to 5 are diagrams illustrating alternative ways of implementing sigma-delta modulation and/or low-pass filtering in response to a parameter of the audio signal,

[0012] FIG. 6 is a simplified topology diagram of a 5th order sigma-delta modulator for use in any of the alternative configurations in FIGS. 1 and 2,

[0013] FIG. 7 is a graphic representation of compression gain for various orders of sigma-delta modulators,

[0014] FIG. 8 is a graphic representation of the effect of adding an extra pole in a high frequency range to the sigma-delta modulator, and

[0015] FIG. 9 is a graphic representation of the relationship between compression gain and signal power in a selected frequency band of an audio signal.

[0016] In the diagram in FIG. 1 an analog input audio signal is supplied to a converter 1 comprising a noise-shaping modulator 2, from which a digital signal is supplied to a lossless encoder 3. The modulator 2 may typically be a sigma-delta modulator supplying the digital signal in the form of a bitstream signal such as a 1 bit bitstream signal in the DSD format.
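
To make the notion of a 1 bit bitstream concrete, the following is a minimal sketch of a first-order sigma-delta modulator. It is an illustration only: the modulators described in this specification are of higher order (3rd, 5th or 7th), and the function name `sigma_delta_1bit` and the assumption that input samples lie within [-1, 1] are introduced here.

```python
import math


def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator returning a list of +1/-1 bits.

    Assumes the input samples lie within [-1.0, 1.0].
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Integrate the difference between the input and the previous output bit.
        integrator += x - feedback
        # 1 bit quantizer: only the sign of the integrator state is kept.
        bit = 1 if integrator >= 0.0 else -1
        bits.append(bit)
        feedback = float(bit)
    return bits


# Example: a 1 kHz tone sampled at 64 x 44.1 kHz.
SAMPLE_RATE_HZ = 64 * 44_100
tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / SAMPLE_RATE_HZ)
        for n in range(4096)]
bitstream = sigma_delta_1bit(tone)
```

The local average of the output bits tracks the input, while the quantization error is pushed towards high frequencies; higher-order loop filters shape that error more aggressively.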

[0017] The lossless encoder 3 may typically have a structure incorporating framing and prediction. By framing, the input signal supplied to the encoder is split up into small parts, enabling the encoder to exploit the short-term pseudo-stationary properties of the audio signal as well as the pseudo-stationary properties of the quantization errors of the sigma-delta modulator 2. Prediction, e.g. by means of a linear FIR filter 4, removes the dependencies or redundancy between successive source samples as much as possible before the coding, which may be conducted in the form of variable length entropy encoding, e.g. using Huffman-like coding algorithms, or arithmetic encoding.
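
The framing and prediction steps described above can be sketched as follows. This is not the encoder of the specification: the prediction coefficients are simply passed in rather than optimized per frame, and the entropy of the prediction residual stands in for an actual Huffman-like or arithmetic coder as an idealized estimate of the achievable compression gain. All function names and the default frame length are assumptions made for illustration.

```python
import math


def split_frames(bits, frame_len):
    """Split a bitstream (list of +1/-1) into consecutive frames."""
    return [bits[i:i + frame_len] for i in range(0, len(bits), frame_len)]


def prediction_residual(frame, coeffs):
    """Return 0 where the FIR prediction matches the actual bit, else 1."""
    residual = []
    for n in range(len(frame)):
        # FIR prediction from previous bits; history before the frame is zero.
        acc = 0.0
        for k, c in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += c * frame[n - k]
        predicted = 1 if acc >= 0.0 else -1
        residual.append(0 if predicted == frame[n] else 1)
    return residual


def binary_entropy(p):
    """Entropy in bits of a binary source with error probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def estimated_gain(bits, coeffs, frame_len=1024):
    """Idealized compression gain: input bits / estimated coded bits."""
    coded_bits = 0.0
    for frame in split_frames(bits, frame_len):
        residual = prediction_residual(frame, coeffs)
        p_err = sum(residual) / len(residual)
        coded_bits += len(frame) * binary_entropy(p_err)
    return len(bits) / coded_bits if coded_bits > 0.0 else float("inf")
```

A bitstream whose successive samples are strongly correlated yields a residual dominated by zeros, hence a low entropy and a high estimated gain.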

[0018] Thereby, the encoder 3 supplies a compressed digital signal which, as shown, may be supplied for recording on a record carrier such as a SACD disc, but may also be used e.g. for transmission via a transmission medium.

[0019] In the configuration shown in FIG. 1 the compression gain of the compressed lossless encoded signal supplied by the encoder 3 is increased in accordance with an embodiment of the invention by adaptation or modification of the sigma-delta modulator 2 in response to a parameter P. As will appear from the following description, several approaches can be used, according to the invention, for such an adaptation or modification of the structure of the sigma-delta modulator, such as use of a lower order modulator or creating structure in the high frequency noise of the modulator.

[0020] In the alternative configuration in FIG. 2 a digital audio input signal is supplied to a converter 5 before being supplied to the lossless encoder 3. The converter 5 includes a low-pass filter 6, by which the bandwidth of the input signal is limited, e.g. to 100 kHz in conformity with the bandwidth specification of the DSD format or even to 50 kHz, followed by an adaptive sigma-delta modulator 7. Although not strictly necessary, the low-pass filter 6 is preferably also made adaptive in response to at least one parameter of the audio signal, which would preferably be the same as, but could also be different from, the signal parameter used for the adaptation or modification of the sigma-delta modulator 7. In a simple embodiment, in the case where the low-pass filter 6 is adaptive, the sigma-delta modulator may be non-adaptive.

[0021] The combination of low-pass filter 6 and adaptive sigma-delta modulator 7 in the converter 5 provides for requantization of the digital input signal. The signal processing apparatus as shown in FIG. 2 may comprise several successive pre-processing blocks 5 to achieve a desired increase of the compression gain.

[0022] The low-pass filter 6 in the converter 5 may e.g. be a 7th order IIR Chebyshev type 1 filter. Generally, the compression gain increase obtained by one or more pre-processing stages as shown in FIG. 2 will be higher than for the configuration in FIG. 1, although this may also result in some quality degradation of the signal due to the bandwidth limitation.

[0023] Obviously, one or more converters 5 as shown in FIG. 2 may also be used in the configuration shown in FIG. 1 between the modulator 2 and the encoder 3.

[0024] The adaptive sigma-delta modulator 2 or 7 may be of the 3rd, 5th or 7th order, providing compression gains ranging from 3.7 or higher for a 3rd order modulator down to only 2.3 or lower for a 7th order modulator, as illustrated in the graphical representation in FIG. 7. It should be emphasized, however, that in general the use of a lower order modulator will result in degradation of audio quality due to a lower dynamic range in the audio band.

[0025] According to an embodiment of the invention, the sigma-delta modulation in modulator 2 is adapted or modified in response to at least one parameter P of the audio signal in order to confine the increase of the compression gain to those parts of the lossless encoded signal for which it is needed. This would typically be at high signal levels, where the compression provided by the encoder 3 will usually drop. As shown in FIG. 3, this may be implemented by means of a feed-back loop 9 incorporating a signal level detector 10. Alternatively, the adaptation may, as shown in FIG. 4, comprise a control device 11 responding to data obtained from the prediction filter 4 in the encoder 3 or, as shown in FIG. 5, a control signal obtained from a signal power extractor and correlator 12, as will be further explained in the following.
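
A level-driven adaptation of the kind shown in FIG. 3 might be sketched as below. The specification does not state how the detected level is mapped to a modulator setting; the RMS measure, the threshold values and the mapping to modulator orders 3, 5 and 7 are all hypothetical choices made here for illustration. The input is assumed to be a frame of signal samples normalized to a full scale of 1.0.

```python
import math


def frame_level_db(samples):
    """RMS level of a frame in dB relative to full scale (1.0)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def select_order(level_db, high_db=-6.0, mid_db=-20.0):
    """Map a detected level to a modulator order (thresholds are hypothetical)."""
    if level_db >= high_db:
        return 3   # high level: favor compression gain over dynamic range
    if level_db >= mid_db:
        return 5
    return 7       # low level: keep the full dynamic range


# Example: choose an order for a loud frame and a quiet frame.
loud_order = select_order(frame_level_db([0.9, -0.9, 0.9, -0.9]))
quiet_order = select_order(frame_level_db([0.01, -0.01, 0.01, -0.01]))
```

The design choice reflects the text above: the lower-order, higher-gain setting is confined to high signal levels, where the compression would otherwise drop and where a slight loss of dynamic range is least likely to be noticed.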

[0026] The diagram in FIG. 6 shows a preferred topology of a 5th order sigma-delta modulator for use in any of the configurations in FIGS. 1-5. The illustrated topology is based on a multiple resonator structure, in which the coefficients c1, c2, . . . c5 in the feed-back loops of resonators R1, R2, . . . R5 determine the poles of the loop filters (or zeroes of the noise transfer function). Whereas the illustrated topology is for a 5th order modulator the same topology may be used for a 7th order modulator just by adding another resonator structure.

[0027] As mentioned above, FIG. 7 shows a graphic representation of compression gain cg for various orders of sigma-delta modulators as a function of amplitude swa for a 10 kHz audio sine wave signal, to illustrate the increase in compression gain for lower order modulators, which is obtained, however, at the expense of an increased quantization noise in the audio band.

[0028] In ordinary design of a modulator the poles will normally be positioned in the audio band. According to a further embodiment of the invention it is preferred, however, as shown in the graphic representation in FIG. 8 of compression gain for various signals as a function of the pole position pp for a 5th order sigma-delta modulator, to have at least one pole positioned outside the audio band to create additional structure in the (otherwise almost flat) high frequency part of the sigma-delta spectrum.

[0029] In standard designs of sigma-delta modulators the poles are typically positioned at 8.7, 15.7 and 19.5 kHz, whereas in accordance with the invention the last pole is preferably shifted from the 20 kHz region to higher frequencies. As will appear from the diagram, positioning of the pole around 200 kHz may result in a rather bad compression gain, because this pole position is too close to the point where the modulator will change from 5th to 1st order behavior, whereby the modulator becomes almost unstable.

[0030] On the other hand positioning of this pole around 300 kHz or higher may lead to a significant increase of compression gain. This may be accompanied by a slight decrease of the signal-to-noise performance, which will be quite acceptable for the adapted modulator, however, because the extra noise is introduced on the high side of the frequency band, where the human ear is less sensitive.
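
To relate the pole frequencies discussed above to a resonator feedback coefficient, the following sketch uses a relation that holds for one common discrete-time resonator form: two integrators, one delaying and one non-delaying, with local feedback coefficient g, which resonate on the unit circle at angle ω when g = 2 − 2 cos ω. Whether the topology of FIG. 6 takes exactly this form is an assumption, as are the function name and the use of 64 × 44.1 kHz as the modulator rate.

```python
import math


def resonator_coefficient(f_hz, sample_rate_hz):
    """Feedback coefficient g = 2 - 2*cos(omega) for a resonance at f_hz.

    Assumes a resonator built from one delaying and one non-delaying
    integrator; other resonator forms use a different relation.
    """
    omega = 2.0 * math.pi * f_hz / sample_rate_hz
    return 2.0 - 2.0 * math.cos(omega)


MODULATOR_RATE_HZ = 64 * 44_100  # assumed modulator sampling rate

# Pole frequencies named in the text, plus the shifted 300 kHz position.
for f_hz in (8_700.0, 15_700.0, 19_500.0, 300_000.0):
    print(f_hz, resonator_coefficient(f_hz, MODULATOR_RATE_HZ))
```

For frequencies far below the sampling rate the coefficient is close to ω², so shifting a pole from the 20 kHz region to 300 kHz increases the corresponding coefficient by roughly two orders of magnitude.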

[0031] The shifting of the pole position from the 20 kHz region towards higher frequencies can be effected by addition of a separate extra band pass filter to the existing modulator structure, e.g. in parallel to the low-pass loop filter. By use of a 2nd order Butterworth band pass filter for such a parallel filter, a significant increase of compression gain can be realized with the resulting modulator remaining stable for large inputs and the signal-to-noise performance in the audio band remaining virtually unchanged with respect to an unmodified modulator.

[0032] According to the invention a further approach as shown in FIG. 5 for the adaptation of the adaptive sigma-delta modulator and/or the adaptive low-pass filter in the pre-processing device is to provide an estimate of the amount of data that can be stored on the recording medium such as a SACD disc and use such an estimate for the adaptive control of the sigma-delta modulator and/or the low-pass filter.

[0033] In theory, to provide such an estimate it could be chosen to determine compression gains only for a subset of a music recording, selected e.g. at random, and to use this estimate as an average gain indication for the whole piece of music.

[0034] In view of the fact, however, that typical pieces of music have a very wide coverage of gains with significant short-time correlations, a very significant fraction of the piece of music would have to be used to obtain an estimate by this approach with the required precision. Due to the amount of computation that would inevitably be required for such an operation this approach could not be seen as an acceptable solution.

[0035] According to the invention a correlation between the signal power of the bitstream signal in the DSD format and the compression gain is used to provide the desired estimate.

[0036] Investigations have demonstrated that within the audio signal band itself, e.g. up to 20 kHz, the correlation is very weak due to a very flat response curve for the compression gain as a function of signal power. A fully usable correlation, resulting from a very steep response curve as illustrated in the graphic representation in FIG. 9, can be observed, however, by shifting to a frequency band just above the normal audible range, e.g. from 20 to 50 kHz. Preliminary limited experiments have revealed that in this way estimates with an accuracy within 1% can be obtained.
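
The estimate described above can be sketched in two steps: measuring the signal power in a selected band, and mapping that power to a compression gain through a previously fitted relation. The specification only states that the response curve in FIG. 9 is steep; modeling it as a straight line fitted by least squares is an assumption made here, as are the function names. The direct DFT is used for self-containment; a practical implementation would more likely use an FFT.

```python
import cmath
import math


def band_power(samples, sample_rate_hz, f_lo_hz, f_hi_hz):
    """Power in the one-sided DFT bins within [f_lo_hz, f_hi_hz], scaled by 1/n^2."""
    n = len(samples)
    power = 0.0
    for k in range(n // 2 + 1):
        freq_hz = k * sample_rate_hz / n
        if f_lo_hz <= freq_hz <= f_hi_hz:
            bin_value = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                            for i, x in enumerate(samples))
            power += abs(bin_value) ** 2
    return power / (n * n)


def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_gain(power, slope, intercept):
    """Predict a compression gain from band power using the fitted line."""
    return slope * power + intercept
```

In use, `fit_line` would be calibrated on measured pairs of band power (e.g. in the 20 to 50 kHz band) and actual compression gain, after which `estimate_gain` gives a gain estimate for new material from its band power alone, without running the encoder.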

[0037] It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. 

1. A method of processing an audio signal, the method comprising the steps of: conversion of the audio signal into a digital signal by a noise-shaping modulation, compressive encoding of the digital signal at a predetermined sampling rate into a compressed digital signal, and supplying the compressed digital signal, the method being characterized in that the noise-shaping modulation is adaptive in response to at least one parameter.
 2. A method as claimed in claim 1, wherein said conversion step includes low-pass filtering said audio signal prior to said adaptive noise-shaping modulation.
 3. A method as claimed in claim 2, wherein said low-pass filtering is adaptive in response to at least one parameter.
 4. A method as claimed in claim 3, wherein said adaptive noise-shaping modulation and/or said adaptive low-pass filtering is controlled by feed-back control, said at least one parameter comprising the signal level obtained from said digital signal.
 5. A method as claimed in claim 1, wherein the adaptive noise-shaping modulation comprises an adaptive low-pass filtering prior to a non-adaptive sigma-delta modulation.
 6. A method as claimed in claim 1, wherein said compressive encoding comprises linear prediction filtering of said digital signal, and wherein said at least one parameter is based on data obtained from said prediction filtering.
 7. A method as claimed in claim 1, wherein said at least one parameter comprises a signal power in a selected frequency band of said digital signal.
 8. A method as claimed in claim 7, wherein said selected frequency band is above 20 kHz.
 9. A method as claimed in claim 1, wherein the adaptive noise-shaping modulation comprises an adaptive sigma-delta modulation having at least one pole above 20 kHz.
 10. A method as claimed in claim 9, wherein said pole is positioned in the high frequency range from 300 kHz and above.
 11. A method as claimed in claim 1, wherein the adaptive noise-shaping modulation is of a multiple resonator structure with a loop filter acting as a band-pass filter in parallel with a low-pass filter.
 12. A device for processing an audio signal, the device comprising: means for conversion of the audio signal into a digital signal by a noise-shaping modulation, means for compressive encoding of the digital signal at a predetermined sampling rate into a compressed digital signal, and means for supplying the compressed digital signal, the apparatus being characterized in that the noise-shaping modulation is adaptive in response to at least one parameter.
 13. An apparatus for transmitting or recording an audio signal, the recording apparatus comprising: an input unit to obtain an audio signal, an audio signal processing device as claimed in claim 12 to process the audio signal to obtain a processed audio signal, an output unit for outputting the processed audio signal. 